18 research outputs found

The Shapley Value of Classifiers in Ensemble Games

Authors: Rozemberczki Benedek, Sarkar Rik
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/06/2021

What is the value of an individual model in an ensemble of binary classifiers? We answer this question by introducing a class of transferable utility cooperative games called ensemble games. In machine learning ensembles, pre-trained models cooperate to make classification decisions. To quantify the importance of models in these ensemble games, we define Troupe, an efficient algorithm that allocates payoffs based on approximate Shapley values of the classifiers. We argue that the Shapley value of models in these games is an effective decision metric for choosing a high-performing subset of models from the ensemble. Our analytical findings prove that our Shapley value estimation scheme is precise and scalable; its performance increases with the size of the dataset and the ensemble. Empirical results on real-world graph classification tasks demonstrate that our algorithm produces high-quality estimates of the Shapley value. We find that Shapley values can be utilized for ensemble pruning, and that adversarial models receive a low valuation. Complex classifiers are frequently found to be responsible for both correct and incorrect classification decisions. Comment: Source code is available here: https://github.com/benedekrozemberczki/shaple
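
As a rough illustration of what an approximate Shapley value means for classifiers in a voting-style game, the sketch below estimates Shapley values by sampling random orderings of the players. It is a generic Monte Carlo permutation estimator, not the Troupe algorithm from the paper. The function name, the weights (read here as each classifier's share of support for the correct class) and the quota of 0.5 are assumptions made for this example only.

```python
import random


def shapley_permutation_estimate(weights, quota, n_permutations=2000, seed=0):
    """Monte Carlo Shapley estimate for a weighted voting game.

    Hypothetical helper for illustration. The game value is
    v(S) = 1 if the summed weights of coalition S reach the quota, else 0.
    Weights are assumed to be non-negative.
    """
    rng = random.Random(seed)
    players = list(range(len(weights)))
    pivotal_counts = [0] * len(weights)

    for _ in range(n_permutations):
        rng.shuffle(players)
        running_total = 0.0
        for player in players:
            running_total += weights[player]
            # With non-negative weights, only the player who first pushes
            # the coalition over the quota has a marginal contribution of 1.
            if running_total >= quota:
                pivotal_counts[player] += 1
                break

    # The fraction of orderings in which a player was pivotal is its estimate.
    return [count / n_permutations for count in pivotal_counts]


if __name__ == "__main__":
    # Assumed toy ensemble: two useful classifiers and one weak one.
    estimates = shapley_permutation_estimate([0.3, 0.3, 0.05], quota=0.5)
    print(estimates)
```

In this toy setting the third classifier can never tip a coalition over the quota, so its estimate is zero, which mirrors the idea of pruning low-valued models from an ensemble.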

arXiv.org e-Print Archive

Edinburgh Research Explorer

Multi-scale attributed node embedding

Authors: Allen Carl, Rozemberczki Benedek, Sarkar Rik
Publication venue: 'Oxford University Press (OUP)'
Publication date: 21/03/2021

We present network embedding algorithms that capture information about a node from the local distribution over node attributes around it, as observed over random walks following an approach similar to Skip-gram. Observations from neighborhoods of different sizes are either pooled (AE) or encoded distinctly in a multi-scale approach (MUSAE). Capturing attribute-neighborhood relationships over multiple scales is useful for a diverse range of applications, including latent feature identification across disconnected networks with similar attributes. We prove theoretically that matrices of node-feature pointwise mutual information are implicitly factorized by the embeddings. Experiments show that our algorithms are robust, computationally efficient and outperform comparable models on social networks and web graphs. Comment: Published in the Journal of Complex Network

arXiv.org e-Print Archive

Edinburgh Research Explorer

Explainable Biomedical Recommendations via Reinforcement Learning Reasoning on Knowledge Graphs

Authors: Edwards Gavin, Nilsson Sebastian, Papa Eliseo, Rozemberczki Benedek
Publication date: 07/10/2022

For Artificial Intelligence to have a greater impact in biology and medicine, it is crucial that recommendations are both accurate and transparent. In other domains, a neurosymbolic approach of multi-hop reasoning on knowledge graphs has been shown to produce transparent explanations. However, there is a lack of research applying it to complex biomedical datasets and problems. In this paper, we explore the approach for drug discovery in order to draw solid conclusions about its applicability. For the first time, we systematically apply it to multiple biomedical datasets and recommendation tasks with fair benchmark comparisons. The approach is found to outperform the best baselines by 21.7% on average whilst producing novel, biologically relevant explanations.

arXiv.org e-Print Archive

Chickenpox Cases in Hungary: A Benchmark Dataset for Spatiotemporal Signal Processing with Graph Neural Networks

Authors: Ferenci Tamas, Kiss Oliver, Rozemberczki Benedek, Sarkar Rik, Scherer Paul
Publication date: 16/02/2021

Recurrent graph convolutional neural networks are highly effective machine learning techniques for spatiotemporal signal processing. Newly proposed graph neural network architectures are repeatedly evaluated on standard tasks such as traffic or weather forecasting. In this paper, we propose the Chickenpox Cases in Hungary dataset as a new dataset for comparing graph neural network architectures. Our time series analysis and forecasting experiments demonstrate that the Chickenpox Cases in Hungary dataset is adequate for comparing the predictive performance and forecasting capabilities of novel recurrent graph neural network architectures.

arXiv.org e-Print Archive

Edinburgh Research Explorer